2021-08-05
Explaining River Layouts
… in which the author will explain layouts of the river Wayland compositor, from the perspective of being one of the developers of the layout protocol. This post is for those who are interested in creating their own layouts or simply want to gain a deeper understanding of how their favourite Wayland desktop works.
The alternate title for this post is "every question on layouts that has ever been answered on #river".
This article was originally written on 2021-08-05 and has seen minor updates since then.
What are river layouts?
In river, the term "layout" is used to describe a set of coordinates and dimensions for a specified number of views[1]. Layouts are generated by external programs which I will call layout generators. When river wants to arrange some views, it asks the active layout generator for a layout, a process that is called layout demand[2].
Layout generators communicate with river via a custom Wayland protocol extension and are therefore just simple Wayland clients, with no special treatment. As such they are allowed to connect and disconnect at any time[3]. As constantly running processes they can keep state between layout demands. This is useful for implementing features such as having different layouts per tag, which river does not implement itself.
River internally uses a simple, strictly linear list to keep track of all views. This is reflected in common view operations, like focus or move, which can only go forwards or backwards and do not take into account the actual relative positions of views on screen. With this in mind, we deemed overly complicated layouts impractical and not worth the complexity when we designed the layout protocol. Therefore the layout demand process is subject to intentional limitations.
A layout generator may of course use complex data structures, like binary trees, to generate the layout, but all additional data and context is lost in transmission. River will always treat the views like a list and you, the user, can always only move up and down this list. Layout generators are not window managers. Keep this in mind when creating your own layouts.
The only information that is sent in a layout demand is the number of views in the layout, the usable space (meaning the part of the screen which is unobstructed by widgets such as status bars), the currently focused tags and finally a serial which is used to identify and group responses to the layout demand. I will cover responding to layout demands later in this post.
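To make the contents of a layout demand more tangible, here is a minimal Python sketch of how a generator might store one. All names (`LayoutDemand`, `focused_tags`, the field names) are hypothetical and not taken from the protocol, and the sketch assumes that the focused tags arrive as a bitmask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutDemand:
    """Everything a generator learns from one layout demand (hypothetical names)."""
    view_count: int     # number of views that need dimensions
    usable_width: int   # width of the area not covered by status bars and the like
    usable_height: int  # height of that same area
    tags: int           # currently focused tags, assumed here to be a bitmask
    serial: int         # identifies this demand in every response


def focused_tags(demand: LayoutDemand) -> list[int]:
    """Return the indices of all set bits in the (assumed) tag bitmask."""
    return [bit for bit in range(32) if demand.tags & (1 << bit)]


# Example demand: three views, tags 0 and 2 focused.
demand = LayoutDemand(view_count=3, usable_width=1920, usable_height=1050,
                      tags=0b101, serial=7)
```

A generator that wants per-tag layouts could use something like the result of `focused_tags` as a key into its own state, since that state survives between layout demands.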
Layouts may have additional values they depend on. For example rivertile, river's default and reference layout generator, which features a main-area and a stack-area similar to the default layout in dwm, has three additional values. My own layout generator, stacktile, has ten. To configure these at runtime, the layout protocol has user commands. These are simply user provided strings river forwards to the layout generator. In fact, they can have more interesting uses than just changing values. You could use them to send any command to your layout generator. It is even possible to create a generator that can receive layout descriptions in some custom language via user commands, like this one.
This is a good time to point out that this post is not a tutorial for writing Wayland clients. It is simply meant to provide a high-level overview of how layouts work combined with a small collection of tips and tricks.
When is a layout generator active?
Somewhat opaque to the user, layout generators have a layout object per output they want to handle. This layout object has a namespace which is used to reliably identify it in a reproducible manner (a namespace will pretty much always belong to the same layout generator after a reboot).
Namespaces are unique per output and bound to clients. No two layout objects can exist for the same output with the same namespace and no two clients can have layout objects with the same namespace, even across outputs. This seems complicated, but actually rules out a ton of possible confusion. Users can be absolutely certain that the same namespace across different outputs always belongs to the same layout generator.
It is of course possible that a layout generator tries to register a namespace that is already in use. We decided to make this a non-fatal error, since it really is not the client's fault. Instead, the client will receive a simple event informing it of this unfortunate circumstance and all requests it sends thereafter will be ignored. A sophisticated client could recover, perhaps even try to register the layout object again but with a different namespace.
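The two namespace rules can be modelled in a few lines of Python. This is only a toy model of the rules as described above, not river's actual bookkeeping; `NamespaceRegistry` and the client and output identifiers are made up for illustration.

```python
class NamespaceRegistry:
    """Toy model of the namespace rules; not river's actual bookkeeping."""

    def __init__(self) -> None:
        self._owner: dict[str, str] = {}         # namespace -> owning client
        self._outputs: dict[str, set[str]] = {}  # namespace -> outputs with a layout object

    def register(self, client: str, output: str, namespace: str) -> bool:
        """Return True on success, False if the namespace is already in use."""
        owner = self._owner.get(namespace)
        if owner is not None and owner != client:
            return False  # a different client already owns this namespace
        outputs = self._outputs.setdefault(namespace, set())
        if output in outputs:
            return False  # this output already has a layout object with this namespace
        self._owner[namespace] = client
        outputs.add(output)
        return True


registry = NamespaceRegistry()
```

A client that gets `False` back corresponds to one that received the "namespace in use" event; it could then retry with a different namespace, as suggested above.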
To set the active layout, users can specify a namespace. If this namespace matches the namespace of a layout object on that output, it will be sent a layout demand and after that has been successfully handled, the layout object is considered active. The user can specify a namespace to use at any time, so this can be used to switch between different layouts. In practice however it turns out that people prefer having a single layout generator that is flexible enough to suit all their needs instead of smaller specialized ones.
How to respond to a layout demand?
There is exactly one correct way to respond to a layout demand: The client is expected to send the push_view_dimensions request exactly as often as there are views in the layout and after that send the commit request. Pushing more or fewer view dimensions is a fatal protocol error.
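The shape of a correct response can be sketched in Python as follows. `RecordingLayout` is a hypothetical stand-in for the layout object that only records what would be sent, and the argument order of both methods is an assumption for this sketch. The layout itself (all views stacked as rows) is just the simplest thing that works and assumes at least one view.

```python
class RecordingLayout:
    """Stand-in for the layout object; it only records what would be sent."""

    def __init__(self) -> None:
        self.requests: list[tuple] = []

    def push_view_dimensions(self, x, y, width, height, serial):
        self.requests.append(("push_view_dimensions", x, y, width, height, serial))

    def commit(self, layout_name, serial):
        self.requests.append(("commit", layout_name, serial))


def respond(layout, view_count, usable_width, usable_height, serial):
    """Stack all views as rows: exactly one push per view, then one commit."""
    row_height = usable_height // view_count
    for i in range(view_count):
        is_last = i == view_count - 1
        # The last row absorbs the pixels lost to integer division.
        height = usable_height - row_height * i if is_last else row_height
        layout.push_view_dimensions(0, row_height * i, usable_width, height, serial)
    layout.commit("rows", serial)
```

However complicated the calculation becomes, the sequence of requests always has this form: `view_count` pushes in list order, followed by a single commit.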
Pushed view dimensions are applied to river's view list in the exact order they were pushed. You cannot just set the view dimensions of an arbitrary view; you have to send them all in sequence, top to bottom. As I already pointed out, all context you used to calculate the view dimensions is lost at this point. All you can send river is an anonymous set of coordinates together with width and height. This seems restrictive at first glance, but actually makes it possible to cleanly split your layout logic into multiple independent sections (more on calculating layouts in a later section).
When working with graphics programming for the first time, there is something that might confuse you initially: The coordinate origin is at the top left corner, not the bottom left corner like you are used to from maths. This might take some time to get used to. On the bright side, you do not need to account for the position of the usable area: For layout generators, the coordinates 0,0 are at the start of the usable area; river will automatically offset the entire layout to place it where it belongs.
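As a small illustration of that offset, the following Python sketch shows the translation from layout-local coordinates to output coordinates. It mimics what river is described as doing on its side; a layout generator never has to do this itself. The function name and the tuple layout `(x, y, width, height)` are choices made for this sketch.

```python
def to_output_coordinates(view, usable_origin):
    """Translate a layout-local (x, y, width, height) box into output coordinates."""
    x, y, width, height = view
    origin_x, origin_y = usable_origin
    return (origin_x + x, origin_y + y, width, height)


# Assume a 30 pixel high status bar at the top of a 1920x1080 output,
# so the usable area starts at (0, 30) and is 1050 pixels high.
top_half = to_output_coordinates((0, 0, 1920, 525), (0, 30))
bottom_half = to_output_coordinates((0, 525, 1920, 525), (0, 30))
```

Note that the lower view has the larger y value, because y grows downwards from the top left corner.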
When committing a layout, you need to provide a layout name. This is a simple string that river in the future will forward to programs like status bars to display. This is meant to make it possible to replicate dwm's behaviour, which also displays the name of the layout in its builtin status bar (for example []= for the default layout). But unlike dwm, in river layouts can (and are encouraged to) have dynamic layout names. Instead of just static strings, layout generators can use this to display useful information to the user, like for example how many views are currently hidden in an overflow area or the current layout configuration.
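A dynamic layout name can be as simple as a formatted string. The following Python sketch assumes a hypothetical generator that shows at most `visible_limit` views and hides the rest in an overflow area; the function name and the exact format of the name are made up.

```python
def layout_name(view_count: int, visible_limit: int) -> str:
    """Build a dwm-like layout name that also reports hidden views."""
    hidden = max(0, view_count - visible_limit)
    return "[]=" if hidden == 0 else f"[]= +{hidden}"
```

With a limit of four visible views, three views would yield `[]=`, while six views would yield `[]= +2`.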
All responses to a layout demand need to use its serial, so river knows which layout demand your layout generator is responding to. This is important for cases like when river sends a new layout demand while your client still handles the old one. In theory a layout generator should stop handling the old layout demand and start working on the new one immediately, but in practice that would massively increase code complexity for clients. It is a somewhat rare event, so you need not worry about wasting a few cycles.
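The role of the serial can be illustrated with a toy receiver in Python. This is not river's implementation: `DemandTracker` is hypothetical, and the sketch assumes that responses carrying an outdated serial are simply ignored.

```python
class DemandTracker:
    """Toy receiver that groups responses by serial (assumed behaviour)."""

    def __init__(self) -> None:
        self.current_serial = 0
        self.pushed: list[tuple[int, int, int, int]] = []

    def new_demand(self) -> int:
        """Start a new demand; anything pushed for the old one is dropped."""
        self.current_serial += 1
        self.pushed = []
        return self.current_serial

    def push(self, box: tuple[int, int, int, int], serial: int) -> bool:
        """Accept a box only if it answers the current demand."""
        if serial != self.current_serial:
            return False  # stale response, assumed to be ignored
        self.pushed.append(box)
        return True
```

On the client side this means the serial should travel with the demand being handled (for example as a function argument) instead of living in a shared "latest serial" variable, so that a slow response never ends up tagged with the serial of a newer demand.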
This is not mentioned in the protocol, but if your layout generator is too slow, a layout demand may time out. In that case, river will fall back to floating mode. This is completely opaque to the client. In practice I cannot imagine a layout so complex it cannot be calculated within the allowed time frame, but the timeout certainly can be an issue when creating a layout generator that uses user provided code for layouts. Even if the user provided code hits an endless loop or is just very slow, river will not be locked up.
How to handle user commands?
User commands are just strings. When writing a layout generator, it is entirely your responsibility to do the parsing, river just forwards you what the user sent.
You are free to use some common format and parsing library, or to just roll your own. If you do the latter, my only recommendation is to consider using a language that can handle string slices instead of just NULL terminated strings to simplify tokenizing and to avoid unnecessary allocations.
Of course, the user command string is not guaranteed to be valid: It may have an unrecognizable format or the values it sets may lead to unwanted behaviour, like integer overflow. Again, it is your responsibility to handle this. If you want to print error messages, just dump them to stderr. A user interested in seeing them can just run your layout generator in a terminal.
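Here is a minimal Python sketch of a hand-rolled parser along those lines. The command format (`set <key> <integer>` and `mod <key> <delta>`), the value names and their limits are all invented for this example; river does not prescribe any format.

```python
import sys

# Hypothetical values and their allowed ranges.
LIMITS = {"main_count": (1, 16), "gap": (0, 200)}


def apply_command(config: dict, command: str) -> bool:
    """Apply 'set <key> <int>' or 'mod <key> <delta>'. Return True if understood."""
    parts = command.split()
    if len(parts) != 3 or parts[0] not in ("set", "mod"):
        print(f"unknown command: {command!r}", file=sys.stderr)
        return False
    verb, key, raw = parts
    if key not in LIMITS or key not in config:
        print(f"unknown value: {key!r}", file=sys.stderr)
        return False
    try:
        number = int(raw)
    except ValueError:
        print(f"not an integer: {raw!r}", file=sys.stderr)
        return False
    value = number if verb == "set" else config[key] + number
    low, high = LIMITS[key]
    config[key] = max(low, min(high, value))  # clamp instead of going out of range
    return True
```

Clamping is only one way to deal with out-of-range values; rejecting the command with an error message would be just as valid. In languages with fixed-width integers, the addition itself also needs an overflow check.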
If you want to be really fancy, you can try to write a layout generator that makes use of some common scripting language, like Lua or Scheme, to generate the layouts and then use the user commands as some sort of REPL (just without the P part). Maybe even allow re-defining the layout function this way. If you are an Emacs user, this may be your chance to integrate river into your favourite editor!
After a user command, if your layout generator is currently the active one and if there are visible views on that output, river will start a new layout demand, so whatever commands the user sent to your generator can apply as soon as possible. Note that river really does not care whether the user command was valid and actually changes the layout in any way. The few cycles saved by not sending a layout demand in those cases are simply not worth the extra round-trip. And in a complete setup, virtually all user commands are correct since they will be sent from keybinds defined in an init script that has already been debugged.
What to consider when designing a layout?
Note that while the code in this section looks suspiciously like zig, it really is just pseudo-code. It completely ignores type-casting, among other things, in order to be a bit more readable.
As I mentioned earlier, all you really have to do is send the push_view_dimensions request for every visible view. These requests can be sent from code paths that are otherwise independent from each other. This makes it possible to build a metaphorical tool box of sorts containing modular building blocks you can later stitch together to form a complete layout.
A good start is realizing that coordinates plus width and height always appear together, so you might want to create a struct / type bundling them together. The common name for such a construct is box.
    const Box = struct {
        x: i32,
        y: i32,
        width: u32,
        height: u32,
    };
Most layouts have at least two parts, the "main" area and the "remainder" area (also sometimes called "master" and "stack", but those names are not very descriptive, so I chose better ones). Some even have more, like my own which has three areas. So think about splitting an area into two areas. Basically what you want is a function that takes one box and returns two boxen.
    /// Shrink the box by half, returning a box containing the other space.
    pub fn splitOffRight(self: *Box) Box {
        self.width = self.width / 2;
        return Box{
            .width = self.width,
            .height = self.height,
            .x = self.x + self.width,
            .y = self.y,
        };
    }
You probably want to control the proportional sizes of the two new areas, for which you probably want to use a floating point ratio. That is no more complicated than the above example, if you know some simple maths: 60% of a can be expressed as 0.6 * a.
    /// Shrink the box to `ratio` of its width, returning a box containing the other space.
    pub fn splitOffRight(self: *Box, ratio: f32) Box {
        std.debug.assert(ratio < 1.0 and ratio > 0.0);
        const old_width = self.width;
        self.width = self.width * ratio;
        return Box{
            .width = old_width - self.width,
            .height = self.height,
            .x = self.x + self.width,
            .y = self.y,
        };
    }
You also probably want to have some sort of space between the areas. Rivertile implements a somewhat naive padding by simply shrinking view dimensions by a configurable amount which works well for its simple layout, but I think "real" gaps are a bit nicer. Luckily, that is not hard to achieve either. Just keep in mind that your space orthogonal to the split is now decreased and that one of the boxes needs an additional offset.
    /// Shrink the box to `ratio` of its width minus half the gap, returning a box
    /// containing the other space on the far side of the gap.
    pub fn splitOffRight(self: *Box, ratio: f32, gap: u32) Box {
        std.debug.assert(ratio < 1.0 and ratio > 0.0);
        const old_width = self.width;
        self.width = (self.width * ratio) - (gap / 2);
        return Box{
            // The gap lies between the two boxes, so it belongs to neither width.
            .width = old_width - self.width - gap,
            .height = self.height,
            .x = self.x + self.width + gap,
            .y = self.y,
        };
    }
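Since the pseudo-code ignores integer conversion, here is a runnable Python sketch of a gapped split with explicit rounding. One way to keep both boxes inside the original area is to subtract the gap from the width of the split-off box, which is what this sketch does. The `Box` class and the method name mirror the pseudo-code but are otherwise just an illustration, and the gap is assumed to be much smaller than the box.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: int
    y: int
    width: int
    height: int

    def split_off_right(self, ratio: float, gap: int) -> "Box":
        """Shrink self to the left part and return the right part."""
        assert 0.0 < ratio < 1.0
        old_width = self.width
        # Round down to whole pixels; half of the gap comes out of the left part.
        self.width = int(old_width * ratio) - gap // 2
        return Box(
            x=self.x + self.width + gap,
            y=self.y,
            # Whatever is left after the left part and the gap.
            width=old_width - self.width - gap,
            height=self.height,
        )
```

Because the right width is computed from what remains, left width, gap and right width always add up to the original width, no matter how the ratio rounds.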
Congratulations, you are now capable of splitting an area into two areas! In fact, this is already enough to create a Fibonacci spiral layout.
    pub fn fibonacciSpiral(self: *Box, view_count: u32) void {
        var i: usize = 0;
        while (i < view_count - 1) : (i += 1) {
            const off_box = blk: {
                if (i % 2 == 0) {
                    if (i % 4 == 2) {
                        break :blk self.splitOffRight();
                    } else {
                        break :blk self.splitOffLeft();
                    }
                } else {
                    if (i % 4 == 3) {
                        break :blk self.splitOffTop();
                    } else {
                        break :blk self.splitOffBottom();
                    }
                }
            };
            off_box.pushViewDimensions();
        }
        self.pushViewDimensions();
    }
See how the Fibonacci spiral function also takes a box as input? Don't stop at making just your simple geometry helper functions modular, do the same for entire layouts! Because of this simple trick, I can now take the Fibonacci spiral and plug it into a different layout[4]. For example, here is a layout that has a main area with one view on the left, and the remaining views are sorted into a spiral on the right.
    pub fn weirdoLayout(self: *Box, view_count: u32) void {
        if (view_count == 1) {
            self.pushViewDimensions();
            return;
        }
        const main_area = self.splitOffLeft();
        main_area.pushViewDimensions();
        self.fibonacciSpiral(view_count - 1);
    }
As you see, even the most basic operation, splitting one box into two, allows you to create complex and fun layouts. The possibilities are already endless (literally!), but you probably want a few more operations, like for example functions that turn a box into n columns or rows (which you will need if you want to recreate common layouts). Creating these additional operations is left as an exercise to the reader.
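As a starting point for that exercise, here is one possible Python sketch of the column operation. It represents a box as an `(x, y, width, height)` tuple; the function name, the optional gap parameter and the choice to hand leftover pixels to the first columns are all decisions made for this sketch, not something the protocol prescribes.

```python
def split_columns(box, n, gap=0):
    """Split an (x, y, width, height) box into n columns separated by gap pixels."""
    x, y, width, height = box
    usable = width - gap * (n - 1)  # width left once all gaps are taken out
    base, extra = divmod(usable, n)
    columns = []
    for i in range(n):
        # Spread the leftover pixels over the first columns.
        column_width = base + (1 if i < extra else 0)
        columns.append((x, y, column_width, height))
        x += column_width + gap
    return columns
```

Rows work the same way with the roles of x and width swapped for y and height. Pushing the returned boxes in order yields a plain column layout, and each of them can in turn be handed to another layout function.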
I might write a more in-depth article about layouts from a purely mathematical perspective later, but for now you have everything you need to get started.
Happy hacking!
Footnotes:
[1] "view" is Wayland-speak for what you, valued reader, probably call "window". Another common term for this is "toplevel". Although all three can have, depending on context, slightly different meanings, in this informal, laissez-faire blog I will use all three interchangeably. Sue me. (Don't actually sue me please)
[2] I initially wanted to call this layout request, which is in my opinion a better fitting word. But because the term request is already used in Wayland and because the layout requests are actually events in the protocol, I chose a different word to avoid confusion.
[3] If no layout generator is active, river will simply fall back to floating window management for all views. You can kill and re-launch the currently active layout generator without losing your session, which makes it very pleasant to hack on them. Even better, a faulty layout generator will not bring down river: if it crashes, it simply disconnects; if it takes too long, the timeout kicks in and river will ignore it.
[4] If you follow this pattern closely, you will at some point realize that operating on fixed size lists is a good use case for recursion. See also: The Lisp family of programming languages.